DLPX-83442 Disable various kernel modules which we don't use #20

Merged
prakashsurya merged 1 commit into 6.0/stage from projects/ps-overrides
Nov 8, 2022

Conversation

@prakashsurya
Contributor

No description provided.

Contributor

@sebroy sebroy left a comment

Is there a way to pass in these overrides as parameters to the build job? I'm just exploring whether there's a possibility of maintaining this list in a single place for all kernels as opposed to duplicating it in all of the kernel repos... (?)

@prakashsurya
Contributor Author

Is there a way to pass in these overrides as parameters to the build job?

Hm.. I'm sure we could dynamically copy it into place as part of the linux-pkg build.. I do think it's a bit awkward to duplicate it in all our platform specific repositories, but at the same time, we do that already for all other kernel changes we make.. so I'm not sure if this specific change should be treated any differently.. but, perhaps we should.. I'm not sure I have an opinion yet, I see both sides having merit (duplicate in all kernel repos, or not)..

I'll think about this some more...

@prakashsurya prakashsurya force-pushed the projects/ps-overrides branch 5 times, most recently from d1fadbe to 32490f0 Compare September 30, 2022 15:30
@prakashsurya prakashsurya force-pushed the projects/ps-overrides branch 2 times, most recently from e62d47b to e92c3e5 Compare October 11, 2022 18:37
@prakashsurya
Contributor Author

Finally got a build to work...

Before:

delphix@ip-10-110-235-80:~$ dpkg-query -L linux-modules-5.4.0-1085-dx2022092505-805ba8691-aws | grep \.ko$ | wc -l
943
delphix@ip-10-110-235-80:~$ dpkg-query -L linux-modules-extra-5.4.0-1085-dx2022092505-805ba8691-aws | grep \.ko$ | wc -l
2704

After:

delphix@ip-10-110-230-45:~$ dpkg-query -L linux-modules-5.4.0-1085-dx2022101118-e92c3e5a4-aws | grep \.ko$ | wc -l
926
delphix@ip-10-110-230-45:~$ dpkg-query -L linux-modules-extra-5.4.0-1085-dx2022101118-e92c3e5a4-aws | grep \.ko$ | wc -l
2568

Not a huge difference, but still something... probably more of a difference in the size, as it looks like we've removed some of the biggest modules..
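As a rough sketch of the arithmetic behind those counts, the snippet below computes how many modules each package lost. The helper names `count_modules` and `module_delta` are made up for illustration. Note that the unquoted `\.ko$` in the transcripts reaches grep as `.ko$` (any character followed by `ko`), so the sketch quotes the pattern to match a literal dot.

```shell
#!/bin/sh
# Count the .ko files shipped by an installed package.
# The pattern is quoted so grep sees a literal dot.
count_modules() {
    dpkg-query -L "$1" | grep -c '\.ko$'
}

# Print how many modules were dropped between two counts.
module_delta() {
    echo "$(( $1 - $2 ))"
}

# Counts reported above for the "modules" and "modules-extra" packages.
module_delta 943 926      # prints 17
module_delta 2704 2568    # prints 136
```

So, taking the reported numbers at face value, 153 modules went away across the two packages.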

Before:

$ find /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws -type f -name '*.ko' | xargs du -b | sort -n | tail
1517305 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
1651089 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/ocfs2/ocfs2.ko
1856433 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/cifs/cifs.ko
2046881 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/xfs/xfs.ko
2126433 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/btrfs/btrfs.ko
2315177 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/radeon/radeon.ko
3340993 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/i915/i915.ko
3543641 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/kernel/kheaders.ko
6170352 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/extra/zfs.ko
7149665 /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko

After:

$ find /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel -type f -name '*.ko' | xargs du -b | sort -n | tail
830921  /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/qlogic/qed/qed.ko
1083481 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/arch/x86/kvm/kvm.ko
1100985 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/fs/nfs/nfsv4.ko
1141993 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko
1150657 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko
1183185 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/scsi/qla2xxx/qla2xxx.ko
1456257 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/scsi/lpfc/lpfc.ko
1516457 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
1856433 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/fs/cifs/cifs.ko
3536985 /usr/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/kernel/kheaders.ko
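
The `tail` listings only show the ten largest files, so they don't give the overall size change. A minimal sketch for totalling a whole module tree follows; `total_ko_bytes` is a hypothetical helper, and it assumes GNU find (for `-printf`), which is what Ubuntu ships.

```shell
#!/bin/sh
# Sum the size in bytes of every .ko file under a directory.
# Uses GNU find's -printf '%s\n' to emit one file size per line.
total_ko_bytes() {
    find "$1" -type f -name '*.ko' -printf '%s\n' |
        awk '{ sum += $1 } END { print sum + 0 }'
}

# Example: total for the running kernel's module tree, if it exists.
if [ -d "/lib/modules/$(uname -r)" ]; then
    total_ko_bytes "/lib/modules/$(uname -r)"
fi
```

Running it against the "before" and "after" trees and subtracting the two results would give the total savings rather than just the change among the top ten.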

@sdimitro
Contributor

@prakashsurya let's make the arguments for this PR more compelling to others :) - above you say that we removed some of the biggest modules but you showcase the directory that has the stripped binaries like this:

delphix@sd-DLPX-60981:~$ find /lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/ -type f -name '*.ko' | xargs du -b | sort -n | tail
1707441	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/lib/test_bpf.ko
1856497	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/fs/cifs/cifs.ko
2046881	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/fs/xfs/xfs.ko
2126433	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/fs/btrfs/btrfs.ko
2315177	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/radeon/radeon.ko
3204409	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/nouveau/nouveau.ko
3340929	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/i915/i915.ko
3583897	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/kernel/kheaders.ko
6170352	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/extra/zfs.ko
7149681	/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko

At that point we see that the amdgpu.ko module takes around 7MB, but that's not the whole story. If you look at the directory with the debug info like this:

delphix@sd-DLPX-60981:~$ find /usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic -type f -name '*.ko' | xargs du -b | sort -n | tail
34235409	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/infiniband/hw/hfi1/hfi1.ko
36567320	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/staging/rtl8723bs/r8723bs.ko
41266553	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/net/mac80211/mac80211.ko
47682657	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/fs/xfs/xfs.ko
53728001	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/radeon/radeon.ko
78037769	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
116411752	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/extra/zfs.ko
128963817	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/i915/i915.ko
222555153	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/nouveau/nouveau.ko
226726577	/usr/lib/debug/lib/modules/5.4.0-126-dx2022092505-cd7eac0b7-generic/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko

You can see that the same module with its debug info, which is part of our images, is 226MB, and the second-largest one is 222MB (which we also strip). I'd guess that with your changes we save more than 1 GB in debug info and probably close to 50~100MB of unstripped data.
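
To give a feel for the scale, here is a small worked sum of the four GPU driver debug files from the listing above. It is only a sketch: it assumes all four (amdgpu, nouveau, i915, radeon) are among the modules the change disables, which the thread does not confirm for nouveau, and it covers a single kernel flavor.

```shell
#!/bin/sh
# Debug-info sizes in bytes, copied from the listing above.
amdgpu=226726577
nouveau=222555153
i915=128963817
radeon=53728001

# Total for the four GPU drivers, in bytes and in whole MiB.
gpu_debug_total=$(( amdgpu + nouveau + i915 + radeon ))
echo "$gpu_debug_total bytes"
echo "$(( gpu_debug_total / 1024 / 1024 )) MiB"
```

Under those assumptions the four files alone account for 631973548 bytes, roughly 602 MiB of debug info.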

Other points include:

  • Saving memory in our crash kernel - the crash kernel doesn't have to load all these modules, and it's always loaded up in memory to get crash dumps. With this change you help us lower its memory consumption and steer us further away from OOM bugs when capturing crash dumps.
  • The storage saved above is not just from the rpool, but also from the images that we send to our customers, including upgrade images.
  • Appliance build time - we don't spend time building all these big graphics cards and Ubuntu's ZFS module. (Do you have any idea how much time we save there?)

@prakashsurya
Contributor Author

@sdimitro yea, for sure.. you're definitely right..

Before:

$ find /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws -type f -name '*.ko' | xargs du -b | sort -n | tail
27737817        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/cifs/cifs.ko
34217553        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/btrfs/btrfs.ko
34224025        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/infiniband/hw/hfi1/hfi1.ko
41256625        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/net/mac80211/mac80211.ko
47670969        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/fs/xfs/xfs.ko
53705529        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/radeon/radeon.ko
78012961        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
116410920       /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/extra/zfs.ko
128893905       /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/i915/i915.ko
226634721       /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko

After:

$ find /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws -type f -name '*.ko' | xargs du -b | sort -n | tail
15955793        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/fs/nfsd/nfsd.ko
16613697        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/net/sunrpc/sunrpc.ko
17090129        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/net/rxrpc/rxrpc.ko
18339633        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/fs/nfs/nfsv4.ko
20836289        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/net/sctp/sctp.ko
26590161        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/netronome/nfp/nfp.ko
26922337        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko
27724657        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/fs/cifs/cifs.ko
77962217        /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
116386800       /usr/lib/debug/lib/modules/5.4.0-1085-dx2022101118-e92c3e5a4-aws/extra/zfs.ko
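
Comparing the two top-ten lists by eye only hints at what was removed. A sketch for listing every module present in one tree but missing from the other is below. `list_modules` and `removed_modules` are hypothetical helpers, and the sketch assumes both trees are reachable from one host (the transcripts above come from two different VMs, so the trees would first have to be copied or unpacked side by side).

```shell
#!/bin/sh
# List .ko files under a directory as sorted relative paths.
list_modules() {
    ( cd "$1" && find . -type f -name '*.ko' | sort )
}

# Print modules that exist under $1 (old tree) but not under $2 (new tree).
removed_modules() {
    old_list=$(mktemp)
    new_list=$(mktemp)
    list_modules "$1" > "$old_list"
    list_modules "$2" > "$new_list"
    # comm -23 keeps lines unique to the first (old) list.
    comm -23 "$old_list" "$new_list"
    rm -f "$old_list" "$new_list"
}
```

Paths are compared relative to each tree's root, so the differing kernel version strings in the directory names don't cause false mismatches.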

Saving memory in our crash kernel

👍 agreed.

The storage saved above is not just from the rpool, but also from the images that we send to our customers, including upgrade images.

yea, for sure.. and it's a 5x savings here, since we have a kernel package for each platform, all bundled into a single upgrade image.. so, eliminating a large binary in the kernel, will remove it from all 5 kernel packages for our 5 current platforms..

Appliance build time - we don't spend time building all these big graphics cards and Ubuntu's ZFS module. (Do you have any idea how much time we save there?)

I think you mean, kernel build time, right? i.e. the time it takes to build the kernel package.. since the appliance build, just consumes the pre-built kernel package.. we still may save time in the appliance build, though, but that'll be due to a smaller package size, not compile time savings..

@prakashsurya
Contributor Author

@sdimitro also something I was curious about.. there's a "linux-modules-extra" package, that contains lots of kernel modules.. I'm curious if we need to install that package on our appliance? do you know?

I've opened this: delphix/delphix-kernel#20 to see if maybe we can remove that package wholesale.. but I'm not yet sure if that's safe to do..

From what I can tell, any module that's built, but not in the "inclusion list" will wind up in this "extra" package.. and from just a number of modules perspective, there's a lot more modules in the "extra" package, than the regular one.. e.g.

$ dpkg-query -L linux-modules-5.4.0-1085-dx2022101118-e92c3e5a4-aws | grep \.ko$ | wc -l
926

$ dpkg-query -L linux-modules-extra-5.4.0-1085-dx2022101118-e92c3e5a4-aws | grep \.ko$ | wc -l
2568

I'm still investigating if there's any modules we care about in the "extra" package, though...
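
One way to start that investigation is to check which currently loaded modules are shipped by the "extra" package. The sketch below is illustrative only: `package_module_names` and `loaded_from_package` are hypothetical helpers, and the result is a snapshot of a single host, so it says nothing about modules that other platforms or future configurations might need.

```shell
#!/bin/sh
# Read a package file list on stdin and print the module names it ships,
# normalised to underscores the way the kernel reports loaded modules.
package_module_names() {
    grep '\.ko$' | sed -e 's|.*/||' -e 's|\.ko$||' | tr '-' '_' | sort -u
}

# $1: file holding a package file list (e.g. saved `dpkg-query -L` output)
# $2: file in /proc/modules format
# Prints the loaded modules that come from that package.
loaded_from_package() {
    pkg_names=$(mktemp)
    loaded_names=$(mktemp)
    package_module_names < "$1" > "$pkg_names"
    awk '{ print $1 }' "$2" | sort -u > "$loaded_names"
    # comm -12 keeps only names present in both lists.
    comm -12 "$pkg_names" "$loaded_names"
    rm -f "$pkg_names" "$loaded_names"
}
```

A possible invocation, assuming the package follows the `linux-modules-extra-<kernel release>` naming seen above, would be to save `dpkg-query -L "linux-modules-extra-$(uname -r)"` to a file and run `loaded_from_package that-file /proc/modules`.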

@prakashsurya
Contributor Author

Hm.. looks like even without the "extra" package, we still get all of the debug modules:

$ find /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws -type f -name '*.ko' | wc -l
3649

as there only seems to be a single package for the debug modules:

$ dpkg-query -S /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws
linux-image-5.4.0-1085-dx2022092505-805ba8691-aws-dbgsym, zfs-modules-5.4.0-1085-dx2022092505-805ba8691-aws-dbg: /usr/lib/debug/lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws

@prakashsurya
Contributor Author

$ find /lib/modules/5.4.0-1085-dx2022092505-805ba8691-aws -type f -name '*.ko' | wc -l
946

it does significantly reduce the number of non-debug modules on the system, though..

@sdimitro
Contributor

@prakashsurya

Appliance build time - we don't spend time building all these big graphics cards and Ubuntu's ZFS module. (Do you have any idea how much time we save there?)

I think you mean, kernel build time, right? i.e. the time it takes to build the kernel package.. since the appliance build, just consumes the pre-built kernel package.. we still may save time in the appliance build, though, but that'll be due to a smaller package size, not compile time savings..

Yeah, apologies. I meant the kernel build time.

@sdimitro also something I was curious about.. there's a "linux-modules-extra" package, that contains lots of kernel modules.. I'm curious if we need to install that package on our appliance? do you know?

Unfortunately I don't. Looking at the file contents of the deb file, it looks like there is a lot of stuff geared towards desktop users, but I'd be hesitant to exclude it. My fear is that some driver may be part of this deb package that we may need to have on the VM even if we are not currently using it. The best example that I have for that was the ENA network driver on AWS. We always shipped with that driver even though it wasn't used - then when customers wanted to enable the driver for faster networking, they just had to reboot their VMs - no hotfix, no upgrade involved.

BTW I noticed on your output above that we still compile and keep around Ubuntu's ZFS module in our build. Any way we can stop that from compiling?

@prakashsurya
Contributor Author

BTW I noticed on your output above that we still compile and keep around Ubuntu's ZFS module in our build. Any way we can stop that from compiling?

yea, I haven't gotten to that yet, but we definitely want to remove ZFS from the kernel package.. I just haven't spent the time to figure out where to do that.. but it's on my list..

The best example that I have for that was the ENA network driver on AWS.

sure, but at the same time.. we'd want to weigh the benefit of removing it against the potential cost of it maybe being useful in the future.. especially considering, it'd only take an upgrade+reboot to reinstate any module that we removed..

but yea, I see your point.

@prakashsurya prakashsurya force-pushed the projects/ps-overrides branch 12 times, most recently from da0c35f to 3d867da Compare October 17, 2022 18:30
@prakashsurya prakashsurya force-pushed the projects/ps-overrides branch from d8214e2 to 1f0201f Compare November 8, 2022 17:41
@prakashsurya prakashsurya merged commit bf86ca4 into 6.0/stage Nov 8, 2022
@prakashsurya prakashsurya deleted the projects/ps-overrides branch November 8, 2022 18:59
delphix-devops-bot pushed a commit that referenced this pull request Aug 16, 2023
BugLink: https://bugs.launchpad.net/bugs/2023230

[ Upstream commit 4e264be ]

When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger@canonical.com>